Probabilistic Multilevel Clustering via Composite Transportation Distance
We propose a novel probabilistic approach to multilevel clustering problems
based on composite transportation distance, which is a variant of
transportation distance where the underlying metric is Kullback-Leibler
divergence. Our method involves solving a joint optimization problem over
spaces of probability measures to simultaneously discover grouping structures
within groups and among groups. By exploiting the connection of our method to
the problem of finding composite transportation barycenters, we develop fast
and efficient optimization algorithms even for potentially large-scale
multilevel datasets. Finally, we present experimental results with both
synthetic and real data to demonstrate the efficiency and scalability of the
proposed approach.
Comment: 25 pages, 3 figures
Revisiting Sliced Wasserstein on Images: From Vectorization to Convolution
The conventional sliced Wasserstein is defined between two probability
measures that have realizations as vectors. When comparing two probability
measures over images, practitioners first need to vectorize images and then
project them to one-dimensional space by using matrix multiplication between
the sample matrix and the projection matrix. After that, the sliced Wasserstein
is evaluated by averaging the one-dimensional Wasserstein distances between the
two corresponding projected probability measures. However, this approach has two
limitations. The first limitation is that the spatial structure of images is not
captured efficiently by the vectorization step; therefore, the subsequent slicing
process has a harder time gathering the discrepancy information. The second
limitation is memory
inefficiency since each slicing direction is a vector that has the same
dimension as the images. To address these limitations, we propose novel slicing
methods for sliced Wasserstein between probability measures over images that
are based on the convolution operators. We derive convolution sliced
Wasserstein (CSW) and its variants via incorporating stride, dilation, and
non-linear activation function into the convolution operators. We investigate
the metricity of CSW as well as its sample complexity, its computational
complexity, and its connection to conventional sliced Wasserstein distances.
Finally, we demonstrate the favorable performance of CSW over the conventional
sliced Wasserstein in comparing probability measures over images and in
training deep generative models on images.
Comment: 34 pages, 12 figures, 10 tables
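The conventional pipeline described in this abstract (vectorize, project, average) can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes equal batch sizes, uses plain Python lists instead of a sample-matrix and projection-matrix product, and the helper names (`vectorize`, `random_direction`, `wasserstein_1d`, `sliced_wasserstein`) and the toy pixel values are hypothetical. It does not implement the convolution-based CSW variants.

```python
import math
import random


def vectorize(image):
    """Flatten a 2-D image (list of rows) into a single vector."""
    return [pixel for row in image for pixel in row]


def random_direction(dim, rng):
    """Sample a unit vector uniformly on the sphere (normalized Gaussian)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def wasserstein_1d(xs, ys, p=2):
    """W_p^p between two equal-size empirical measures on the real line.

    In one dimension the optimal coupling matches sorted samples.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def sliced_wasserstein(images_x, images_y, n_projections=50, p=2, seed=0):
    """Monte Carlo estimate of the order-p sliced Wasserstein distance."""
    rng = random.Random(seed)
    X = [vectorize(img) for img in images_x]
    Y = [vectorize(img) for img in images_y]
    dim = len(X[0])
    total = 0.0
    for _ in range(n_projections):
        # Each slicing direction has the same dimension as a flattened image,
        # which is the memory cost the abstract points out.
        theta = random_direction(dim, rng)
        proj_x = [sum(t * v for t, v in zip(theta, x)) for x in X]
        proj_y = [sum(t * v for t, v in zip(theta, y)) for y in Y]
        total += wasserstein_1d(proj_x, proj_y, p)
    return (total / n_projections) ** (1.0 / p)


# Usage on two tiny batches of 2x2 "images" (made-up values).
batch_a = [[[0.0, 0.1], [0.2, 0.3]], [[0.5, 0.4], [0.3, 0.2]]]
batch_b = [[[1.0, 0.9], [0.8, 0.7]], [[0.6, 0.7], [0.8, 0.9]]]
distance = sliced_wasserstein(batch_a, batch_b)
```

Because the flattening step discards which pixels are neighbors, every direction treats the image as an unordered vector, which is the first limitation the abstract describes.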
Energy-Based Sliced Wasserstein Distance
The sliced Wasserstein (SW) distance has been widely recognized as a
statistically effective and computationally efficient metric between two
probability measures. A key component of the SW distance is the slicing
distribution. There are two existing approaches for choosing this distribution.
The first approach is using a fixed prior distribution. The second approach is
optimizing for the best distribution which belongs to a parametric family of
distributions and can maximize the expected distance. However, both approaches
have their limitations. A fixed prior distribution is non-informative in terms
of highlighting projecting directions that can discriminate two general
probability measures. Optimizing for the best distribution is often
expensive and unstable. Moreover, the parametric family of candidate
distributions can easily be misspecified. To address these issues, we
propose to design the slicing distribution as an energy-based distribution that
is parameter-free and has the density proportional to an energy function of the
projected one-dimensional Wasserstein distance. We then derive a novel sliced
Wasserstein metric, the energy-based sliced Wasserstein (EBSW) distance, and
investigate its topological, statistical, and computational properties via
importance sampling, sampling importance resampling, and Markov Chain methods.
Finally, we conduct experiments on point-cloud gradient flow, color transfer,
and point-cloud reconstruction to show the favorable performance of the EBSW.
Comment: 36 pages, 7 figures, 6 tables
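One way to read the energy-based slicing idea with importance sampling is sketched below. It rests on several assumptions that the abstract does not fix: the energy function is taken to be the exponential, the proposal is the uniform distribution on the sphere, and both point clouds have the same number of points. All function names (`projected_distances`, `is_ebsw`, `uniform_sw`) and the toy point clouds are hypothetical, and the estimator in the paper may differ in detail.

```python
import math
import random


def random_direction(dim, rng):
    """Sample a unit vector uniformly on the sphere (normalized Gaussian)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def wasserstein_1d(xs, ys, p=2):
    """W_p^p between two equal-size empirical measures on the real line."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def projected_distances(X, Y, n_projections=50, p=2, seed=0):
    """One-dimensional W_p^p values along uniformly sampled directions."""
    rng = random.Random(seed)
    dim = len(X[0])
    dists = []
    for _ in range(n_projections):
        theta = random_direction(dim, rng)
        proj_x = [sum(t * v for t, v in zip(theta, x)) for x in X]
        proj_y = [sum(t * v for t, v in zip(theta, y)) for y in Y]
        dists.append(wasserstein_1d(proj_x, proj_y, p))
    return dists


def is_ebsw(dists, p=2):
    """Self-normalized importance-sampling estimate (assumed energy: exp).

    Each direction is weighted in proportion to exp(W_p^p) of its projection,
    so more discriminative directions contribute more to the average.
    """
    shift = max(dists)  # subtract the maximum for numerical stability
    weights = [math.exp(d - shift) for d in dists]
    normalizer = sum(weights)
    value = sum(w * d for w, d in zip(weights, dists)) / normalizer
    return value ** (1.0 / p)


def uniform_sw(dists, p=2):
    """Plain sliced Wasserstein estimate from the same projections."""
    return (sum(dists) / len(dists)) ** (1.0 / p)


# Usage on two small 3-D point clouds (made-up values).
cloud_a = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
cloud_b = [[0.0, 0.0, 2.0], [1.0, 0.0, 2.5], [0.0, 1.0, 3.0]]
dists = projected_distances(cloud_a, cloud_b)
ebsw_value = is_ebsw(dists)
sw_value = uniform_sw(dists)
```

Since the weights increase with the projected distance, this weighted average is never smaller than the uniform average computed from the same directions, and no parameters of the slicing distribution need to be optimized.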
Amortized Projection Optimization for Sliced Wasserstein Generative Models
Seeking informative projecting directions has been an important task in
utilizing sliced Wasserstein distance in applications. However, finding these
directions usually requires an iterative optimization procedure over the space
of projecting directions, which is computationally expensive. Moreover, the
computational issue is even more severe in deep learning applications, where
computing the distance between two mini-batch probability measures is repeated
several times. This nested loop has been one of the main challenges that
prevent the usage of sliced Wasserstein distances based on good projections in
practice. To address this challenge, we propose to utilize the
learning-to-optimize technique or amortized optimization to predict the
informative direction for any two given mini-batch probability measures. To the
best of our knowledge, this is the first work that bridges amortized
optimization and sliced Wasserstein generative models. In particular, we derive
linear amortized models, generalized linear amortized models, and non-linear
amortized models, which correspond to three types of novel mini-batch
losses, named amortized sliced Wasserstein. We demonstrate the favorable
performance of the proposed sliced losses in deep generative modeling on
standard benchmark datasets.
Comment: Accepted to NeurIPS 2022, 22 pages, 6 figures, 8 tables